THE SELECTION AND USE OF RESPONSE VARIABLES IN
EDUCATIONAL EXPERIMENTS USING MULTIVARIATE ANALYSIS WERE 
CONSIDERED AND ASSESSED. RESPONSE VARIABLES WERE DEEMED 
CENTRAL CONCERNS FOR EITHER THEORETICAL OR PRACTICALLY 
ORIENTED RESEARCH, AND THEIR COMPLEXITY WAS DEALT WITH UNDER 
THE HEADINGS OF APTITUDE INPUT MEASURES, REPEATED LEARNING 
MEASURES, MULTIPLE LEARNING MEASURES, AND APTITUDE OUTPUT 
MEASURES. IT WAS CONCLUDED THAT, IF INSTRUCTIONAL METHODS AND 
PROCESSES ARE TO BE UNDERSTOOD AND IMPROVED, RESPONSE 
COMPLEXITY PROBLEMS MUST BE SOLVED. EARLY EMPHASIS SHOULD BE
PLACED ON THE ASSESSMENT OF (1) THE NATURE OF STUDENT 
APTITUDES AS THEY INTERACT WITH TEACHING AND LEARNING 
PROCESSES, (2) THE COURSE OR PATTERNING OF THESE PROCESSES 
ACROSS THE OCCASIONS ON WHICH THEY OCCUR, (3) THE EXTENSITY 
OF INSTRUCTIONAL EFFECTS AS WELL AS THE INTENSITY OF ANY ONE 
EFFECT, AND (4) THE ENDURING CHANGES AND SUBSEQUENT EFFECTS 
OF LEARNING RELATIVE TO THE PATTERN OF INTELLECTUAL
DEVELOPMENT IN GENERAL. THIS PAPER WAS PRESENTED AT THE 
ANNUAL STATE CONFERENCE ON EDUCATIONAL RESEARCH (18TH, SAN 
FRANCISCO, NOVEMBER 18, 1966). (GD) 
The phenomena of classroom teaching and learning are obviously
complex, so complex in fact that it is unreasonable to expect any one 
study to deal effectively with more than a small portion of the
potentially relevant variables. With this recognition and with the help of
Fisherian design principles, educational experiments have increasingly 
been conceived as multivariate investigations. Usually, several 
independent stimulus variables are manipulated and their joint effects 
on a single learning criterion are assessed. Such designs presume 
stimulus complexity but they ignore the possibility of response
complexity. Experiments can also be designed to incorporate several 
different or repeated dependent response variables and/or to include, 
as additional independent variables, selections from a special class
of antecedent response characteristics, here called "aptitudes". In
classroom research, where students can be expected to bring widely 
different patterns of relevant prior experience to the experiment, and 
particularly in curriculum evaluation studies, where many different 
learning criteria might also be appropriately applied, experimental 
designs which are multivariate in this latter sense may be particularly
important. The present paper considers the selection and use of 
response variables in the design of such investigations. 

Four classes of response measurements can be distinguished. 

First, there are antecedent response variables, represented usually 
by scores on aptitude tests administered prior to an instructional 
treatment but including also sex, age, or any other index of a
potentially important human difference. Second, the traditional concept
of a single achievement test straddling an instructional treatment can 
sometimes be extended to permit repeated or intermediate measures at 
several points as instruction proceeds. This possibility can in turn 
be expanded to produce a third class of variables representing the many 
different learning effects to be assessed during instruction. Finally, 
there are the more remote or enduring effects of instruction as 
reflected in tests of retention, transfer and aptitudinal or 
attitudinal change.



These four kinds of measures are shown in Table I. Listed
beneath each are some of the methods by which such variables have been
or could be treated for the purposes of design and statistical analysis.
The list is not exhaustive: it does not presume to survey in any
general way the relevant statistical methods and design considerations
already treated in detail by Tatsuoka and Tiedeman (1963) or by
Campbell and Stanley (1963). It is intended rather to emphasize some
points not made explicit in those two major sources and to publicize
some more recent and ongoing developments. The table will not be
described in detail but will be used instead to organize some more
general comments about problems and possibilities for classroom
research with respect to each class of variables.



Insert Table I about here 



Aptitude Input Measures. Individual differences among students
on aptitude variables have traditionally been viewed as a source of
error to be controlled, in earlier days by matching procedures and,
more recently, through the use of covariance analysis. A third
view suggests that differential aptitudes should often be systematically
included in experiments rather than being covaried out of them. Shreds
of evidence so far available from the many studies that have
incidentally correlated aptitude variables with learning under different
instructional methods, or from the few investigations aimed specifically
in this direction, indicate that the possibility of aptitude-treatment
interactions deserves serious consideration. The demonstration of
intersecting regression lines for the two treatments, that is, a
disordinal interaction, implies that one instructional treatment is best
for one group of students while another treatment is best for a
different group. Such findings have practical as well as theoretical
significance. They provide decision rules for the assignment of
students to different paths toward the same instructional objective
and they provide insights into the nature of aptitude functioning. For
a fuller discussion of the importance of disordinal interactions, see
Cronbach (1957) and Cronbach and Gleser (1965).
TABLE I

[Diagram: data cube of dependent response variables, with axes labeled
variables and occasions]

Column headings: Aptitude Input Measures, Repeated Learning Measures,
Multiple Learning Measures, Aptitude Output Measures

INCOMPLETE CONTROL OR ANALYSIS
   Aptitude input:     Matching
   Repeated learning:  Simple gain; relative gain; residual gain
                       (DuBois, 1962)
   Multiple learning:  Arbitrary composite; separate ANOVA
   Aptitude output:    Post-treatment ANCOVA (Gourlay, 1953; Cox, 1958)

CONTROL
   Aptitude input:     ANCOVA; MANCOVA (Winer, 1962)
   Repeated learning:  Sequential ANCOVA

ANALYSIS OF VARIANCE
   Aptitude input:     Treatment X levels ANOVA (Lindquist, 1953;
                       Stanley, 1960; Page, 1965)
   Repeated learning:  Repeated measures ANOVA and trend analysis
                       (Gaito and Wiley, and Bock, in Harris, 1963);
                       factor analysis of variance (Gollob, 1966)
   Multiple learning / aptitude output:
                       Hotelling's T^2; dyadic ANOVA (Tukey, 1949);
                       ANOVA with multiple dependent variables
                       (Roy and Gnanadesikan, 1959; Tukey, 1962;
                       Bock, in press; Bock and Haggard, in press)

REGRESSION ANALYSIS
   Aptitude input:     Multiple regression analysis
   Repeated learning:  Generalized learning curve components
                       (Tucker, 1960)
   Multiple learning / aptitude output:
                       Multiple discriminant function
                       (Cooley and Lohnes, 1962)

   Cattell's Covariation Chart (Cattell, 1959)
   Three-mode factor analysis (Tucker, in Harris, 1963)
   Canonical correlation and factor analysis (Harris, in Harris, 1963)
   Inter-battery factor analysis (Kristoff, 1965)

It should be noted that main effects in an analysis of
variance are meaningless in the presence of such interaction and 
variables not included in the experiment have no opportunity to 
demonstrate their interactive effects. In view of the large number of 
nonsignificant overall treatment comparisons obtained in instructional 
research, it is appropriate to ask how many studies mask such inter- 
actions by ignoring aptitudes and averaging over them. One can also 
ask how many reportedly significant main effects are misleading, for 
the same reason. 
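
The idea of a disordinal interaction can be made concrete with a short sketch. The Python fragment below is an illustration only, under stated assumptions: the scores are invented, the function names (fit_line, crossover, is_disordinal) are hypothetical, and a simple least-squares line is fitted within each treatment. It finds the aptitude score at which the two regression lines intersect and checks whether that point lies inside the observed aptitude range.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares (intercept, slope) for y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def crossover(line_a, line_b):
    """Aptitude score where two regression lines meet; None if parallel."""
    (a0, a1), (b0, b1) = line_a, line_b
    if a1 == b1:
        return None
    return (b0 - a0) / (a1 - b1)


def is_disordinal(cut, lowest, highest):
    """True when the lines cross inside the observed aptitude range."""
    return cut is not None and lowest < cut < highest


# Invented data (an assumption for illustration): five students per treatment.
aptitude = [10, 20, 30, 40, 50]
outcome_a = [30, 35, 40, 45, 50]  # treatment A improves with aptitude
outcome_b = [50, 45, 40, 35, 30]  # treatment B declines with aptitude

line_a = fit_line(aptitude, outcome_a)
line_b = fit_line(aptitude, outcome_b)
cut = crossover(line_a, line_b)
print(cut, is_disordinal(cut, min(aptitude), max(aptitude)))
```

With these invented scores, students below the crossover would be assigned to treatment B and those above it to treatment A. An intersection falling outside the observed range would indicate an ordinal rather than a disordinal interaction.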

The manner in which aptitude variables can be used in experiments 
depends upon several considerations. If it is possible to assign
individual students randomly to treatments and if one aptitude variable 
or several uncorrelated aptitude variables are to be used, then treatments 
X levels designs as discussed by Lindquist (1953) and Stanley (1960) are 
appropriate. If one must deal with intact groups, such as classes, then 
an approach suggested by Page (1965) is possible. The case of two or 
more correlated variables with either individual or group randomization 
has not been formally worked out, though with individual random assign- 
ment some useful approximations are possible. Various correlational 
approaches are also available. 

The selection of likely aptitude variables, however, rests on an 
analysis of subtle differences existing between the instructional 
treatments under consideration. At present there are few guidelines, 
though one hypothesis suggests that the status variables used most 
frequently in the past (e.g. sex, age, and IQ) may not be the most 
useful for this purpose. Of greater value may be the more narrowly 
defined intellectual and personality characteristics and, perhaps, 
members of a newer class of individual variables referred to as 
cognitive styles and preferences. In some exemplary studies, aptitude 
measures have been developed specifically for the particular instruc- 
tional comparisons of interest. It is quite possible that the most 
useful aptitudes for learning research may have no precedents in the 
literature of differential psychology. 




Repeated Learning Measures. Laboratory experimentation on
learning has traditionally used practice as the continuum along which 
knowledge or skill acquisition could be measured. Although "improvement 
with practice" or "change with repetition" are incomplete as definitions 
of learning, practice curves are thought to show important features of 
learning phenomena and are, therefore, commonly used in presenting data 
from the laboratory. In research on formal instruction, however, 
investigators have normally been unable to treat their data in this way. 
The pre- vs. posttest comparisons or gain scores derived from these
tests have had to suffice. Thus, the presence or absence of some 
degree of learning has been studied, but there have been few attempts 
to investigate the course of acquisition. 
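
For readers who want the gain scores spelled out, a minimal sketch follows. It assumes invented pretest and posttest scores and hypothetical function names. Residual gain is taken here in its usual sense, the deviation of each posttest score from the value predicted from the pretest by least squares; that reading is an assumption and not a reproduction of any one author's procedure.

```python
from statistics import mean


def simple_gain(pre, post):
    """Posttest minus pretest for each student."""
    return [b - a for a, b in zip(pre, post)]


def residual_gain(pre, post):
    """Posttest minus the posttest predicted from the pretest."""
    mx, my = mean(pre), mean(post)
    sxx = sum((a - mx) ** 2 for a in pre)
    sxy = sum((a - mx) * (b - my) for a, b in zip(pre, post))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(pre, post)]


# Invented scores for five students (an assumption for illustration).
pre = [40, 50, 60, 70, 80]
post = [58, 61, 72, 75, 84]

gains = simple_gain(pre, post)
residuals = residual_gain(pre, post)
print(gains)
print([round(r, 2) for r in residuals])
```

By construction the residual gains sum to zero and are uncorrelated with the pretest, whereas the simple gains in this example shrink as the pretest score rises.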

Acknowledging that more than two test administrations may be 
difficult to obtain, it is nonetheless true that educational researchers 
have not really considered the possibility. They have continued using 
the difference score despite its apparent faults. Consequently, the 
effects of repetitive achievement testing are not known. There have 
also been few attempts to build truly equivalent forms of aptitude or 
achievement tests. The problems of measuring change form the subject 
of a whole book recently edited by Harris (1963) and cannot be elaborated 
upon here. One observation that can be made, however, is that powerful 
methodology is available for the analysis of repeated measures data, if 
such data can be obtained. Even without a series of formally equivalent
measures available, there is still the possibility of serial
measurements of a rougher sort, perhaps using unit tests, and various kinds
of correlational analyses. 
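
One simple form of repeated measures analysis can be sketched as follows. The fragment assumes four equally spaced testing occasions, invented unit-test scores, and hypothetical names. It reduces each student's series to linear and quadratic trend scores using the standard orthogonal polynomial coefficients for four points, and those scores could then be compared between treatments.

```python
# Orthogonal polynomial coefficients for four equally spaced occasions.
LINEAR = (-3, -1, 1, 3)
QUADRATIC = (1, -1, -1, 1)


def contrast(scores, coefficients):
    """Weighted sum of one student's scores across occasions."""
    if len(scores) != len(coefficients):
        raise ValueError("one coefficient is needed per occasion")
    return sum(c * s for c, s in zip(coefficients, scores))


def trend_scores(group):
    """(linear, quadratic) trend score for every student in a group."""
    return [(contrast(s, LINEAR), contrast(s, QUADRATIC)) for s in group]


# Invented scores on four occasions, one row per student (an assumption).
treatment_a = [[10, 12, 14, 16], [8, 11, 14, 17]]
treatment_b = [[10, 15, 18, 19], [9, 14, 17, 18]]

trends_a = trend_scores(treatment_a)
trends_b = trend_scores(treatment_b)
print(trends_a)
print(trends_b)
```

In this invented case the students in treatment A rise in a straight line (quadratic score of zero), while those in treatment B rise quickly and then level off (negative quadratic score), a difference that a single pretest and posttest comparison could not reveal.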

It is unlikely that learning will be understood, or curricula 
adequately evaluated, by considering global instructional treatments 
as black boxes with achievement tests as measures of inputs and out- 
puts. A more analytical view must be adopted which includes ideas about 
the sequencing, staging, or patterning of learning phenomena. Acquisi- 
tion stage measures may then be used in combination with aptitude input 
measures to provide specific diagnostic clues for curriculum revision 
as well as a more basic understanding of the complexity of educational 
processes in general. Comparisons between instructional treatments
on a unit by unit, or even item by item, basis clearly offer a finer 
grain analysis than global gain scores. In this connection, group 
comparisons using item sampling (Cronbach, 1963) and criterion 
referencing (Glaser, 1963) ideas deserve consideration. 

Multiple Learning Measures. Part of the problem mentioned in the
preceding section is that instructional research has been wedded for 
too long to the achievement test as the sole arbiter of theory and 
practice. The suggestion that other kinds of dependent variables might
be used is not new, yet rarely are multivariate data actually collected. 
If collected, rarely are they analyzed in a way that capitalizes on the 
fact that several measures are available on the same subjects. The 
typical approach has been to treat each variable by a separate ANOVA 
and, at most, to examine a table of intercorrelations thereafter. Some 
more satisfying analytical methods do exist but they have not yet been 
fully developed or publicized. Concerted effort by statisticians in 
this critically important area has really only just begun. It is
therefore not surprising that researchers have continued to think in 
terms of single dependent variables and separate analyses. 
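
As a contrast with separate univariate analyses, the sketch below computes Hotelling's T^2, one of the procedures named in Table I, for two treatment groups measured on two dependent variables at once. It is an illustration under stated assumptions: the data are invented, the two variables (number correct and response latency) are chosen only as an example, and the function names are hypothetical. The code handles exactly two dependent variables so that the matrix inverse can be written out by hand.

```python
def mean_vector(rows):
    """Mean of each of the two dependent variables."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def sscp(rows, center):
    """2 x 2 sums of squares and cross-products about `center`."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - center[0], r[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def hotelling_t2(group1, group2):
    """Two-sample T^2, its F transformation, and the F degrees of freedom."""
    n1, n2, p = len(group1), len(group2), 2
    m1, m2 = mean_vector(group1), mean_vector(group2)
    w1, w2 = sscp(group1, m1), sscp(group2, m2)
    df_within = n1 + n2 - 2
    # Pooled within-group covariance matrix.
    s = [[(w1[i][j] + w2[i][j]) / df_within for j in range(2)]
         for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    t2 = (n1 * n2 / (n1 + n2)) * quad
    f = (n1 + n2 - p - 1) / (p * df_within) * t2
    return t2, f, (p, n1 + n2 - p - 1)


# Invented (number correct, latency) pairs, four students per treatment.
group_a = [(10, 4), (12, 4), (10, 6), (12, 6)]
group_b = [(12, 5), (14, 5), (12, 7), (14, 7)]

t2, f, df = hotelling_t2(group_a, group_b)
print(t2, f, df)
```

The single T^2 statistic tests both mean differences jointly while taking the correlation between the two measures into account, which separate ANOVAs on each variable do not.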

Nonetheless, it is increasingly apparent that the concern of 
curriculum research, for example, should be aimed at determining the 
effects of an instructional program, not merely its effectiveness 
(Cronbach, 1965). Acceptance of this revised goal creates a need for 
new methodology capable of handling a multidimensional conception of
curriculum evaluation. In other areas of instructional research also, 
it is clear that the outcomes of instruction are potentially many and 
that a given stimulus variable may affect one kind of dependent variable 
and not another, or perhaps even affect the intercorrelation between 
them rather than the mean of either. 

The most obvious instance of multiple criteria is the use of both 
response correctness and response latency. Another example might be the 
measurement of attitudinal or interest changes as well as cognitive
changes due to instruction. But one can go a good deal further, 
especially if the multiple measures are also conceived as repeated 
measures. Some other possibilities are: various kinds of teacher or
student ratings, quantified characteristics of student essays or
extemporaneous written or oral classroom behavior, disciplinary,
attendance, or tardiness records, library and special resource usage,
extent of various outside-of-school activities, etc. An
occasions X variables X students data cube such as that pictured in
Table I could provide the basis for extensive analyses of teacher or
curriculum effects and even studies in what might be called the
ecology of classroom behavior.
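
The data cube just described can be represented quite directly. The sketch below is a hypothetical layout, not a prescribed one: scores are stored as cube[occasion][variable][student], the two variables and all values are invented, and the helper names are assumptions. Slicing the cube one way gives a single student's curve across occasions, and slicing it another way gives group means at each occasion.

```python
# cube[occasion][variable][student]; invented values for illustration.
# Variable 0: unit test score. Variable 1: library visits.
cube = [
    [[10, 14], [1, 0]],  # occasion 1
    [[12, 18], [2, 1]],  # occasion 2
    [[15, 20], [2, 3]],  # occasion 3
]


def student_curve(cube, variable, student):
    """One student's scores on one variable across all occasions."""
    return [occasion[variable][student] for occasion in cube]


def occasion_means(cube, variable):
    """Mean over students of one variable at each occasion."""
    return [sum(occasion[variable]) / len(occasion[variable])
            for occasion in cube]


print(student_curve(cube, 0, 1))
print(occasion_means(cube, 0))
print(occasion_means(cube, 1))
```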

Aptitude Output Measures. Most of what was said about multiple
learning measures is also true for aptitude output measures, so little 
more need be added about them. In fact, the distinction between the 
two classes, though deemed helpful, may be a bit arbitrary. The 
former class was defined to contain variables arising within an 
instructional treatment and readily adaptable to repeated measurement 
as well. The latter class refers to measures administered after the 
close of the treatment. 

Presumably, long-term, relatively permanent effects of instruction 
are reflected in retention, transfer, and aptitude measures. Many
instructional objectives, particularly those of the new curricula, are
in fact stated in transfer or aptitudinal terms (Cronbach, 1965). In 
some cases, new forms of instruction can be compared with older methods 
only in these terms, since no achievement test can be constructed
without bias toward one or the other kind of content. 

Until recently, such investigations either used separate ANOVAs
or evaluated the retention or transfer effects after controlling
immediate learning effects by covariance analysis. Although advocated
by some authorities, this post-treatment covariance procedure cannot
be recommended here. As in the case of multiple learning measures,
the newer developments in multivariate ANOVA should supplant older,
less complete modes of analysis. The future should also see increased
use of methods for comparing aptitude batteries administered before
and after instruction.
Summary. In summary, four emphases have been suggested as
central concerns for either theoretically or practically oriented
research. If instructional methods and processes are to be understood 
and improved, a much clearer conception is needed of: 

1) the nature of student aptitudes as they interact with 
teaching and learning processes,

2) the course or patterning of these processes across the 
occasions on which they occur, 

3) the extensity of instructional effects as well as the 
intensity of any one effect, and 

4) the enduring changes and subsequent effects of learning 
relative to the pattern of intellectual development in 
general. 

Research aimed at these goals must move from the upper half to
the lower half of Table I for its methodology. This multivariate
conception of learner input and learning outcome taxes our knowledge
of traditional experimental design, while offering the hope that
newer designs commensurate with the richness of classroom behavior
will be increasingly available and increasingly applied.